Skip to content

Add learnings from code review analysis#2280

Merged
xingyaoww merged 5 commits intomainfrom
openhands/learn-from-reviews-20260303
Mar 7, 2026
Merged

Add learnings from code review analysis#2280
xingyaoww merged 5 commits intomainfrom
openhands/learn-from-reviews-20260303

Conversation

@neubig
Copy link
Copy Markdown
Contributor

@neubig neubig commented Mar 3, 2026

Summary

Adds code review learnings distilled from PR feedback analysis to AGENTS.md and creates a new eval-risk-assessment skill.

Generated using the learn-from-code-review skill.

Details

Analysis Summary

  • PRs analyzed: 15 merged PRs from the past 30 days
  • Comments processed: 25+ meaningful review comments
  • Patterns identified: 4 categories

New Skill: .openhands/skills/eval-risk-assessment/SKILL.md

Guidelines for identifying and flagging changes that could affect benchmark/evaluation performance:

  • Prompt template changes require lightweight evals
  • Tool description modifications need review
  • Agent behavior changes should be flagged

AGENTS.md Updates

Added general guidelines distilled from review feedback:

  • Testing: Test new behavior, not just removed behavior
  • Input Validation: Validate user inputs early with clear error messages
  • API Changes: Document breaking changes and mark deprecated endpoints

Testing

Content-only PR (documentation/skills). No unit tests applicable.

Verified:

  • Markdown renders correctly
  • Skill follows SKILL.md format conventions
  • AGENTS.md changes are additive and don't break existing content

Evidence

Content-only PR - no live run required.

The learnings were validated by cross-referencing with the source PR reviews (links included in the PR files).

Checklist

  • Skill follows SKILL.md format
  • AGENTS.md updates are accurate
  • Source PRs referenced for traceability
  • Already approved by reviewer

Distilled patterns from recent PR reviews into:
- New skill: .openhands/skills/eval-risk-assessment/SKILL.md
- AGENTS.md guideline updates (Code Review Learnings section)

Analysis summary:
- PRs analyzed: 15 merged PRs from the past 30 days
- Comments processed: 25+ review comments
- Patterns identified:
  - Eval Risk Assessment (3+ occurrences)
  - Testing new behavior
  - Input validation
  - API documentation

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 3, 2026

API breakage checks (Griffe)

Result: Failed

Log excerpt (first 1000 characters)

============================================================
Checking openhands-sdk (openhands.sdk)
============================================================
Comparing openhands-sdk 1.12.0 against 1.11.5
::warning file=openhands-sdk/openhands/sdk/conversation/conversation.py,line=103,title=Conversation.__new__(delete_on_close)::Parameter default was changed: `False` -> `True`
::notice title=openhands-sdk API::Ignoring Field metadata-only change (non-breaking): temperature
::warning file=openhands-sdk/openhands/sdk/llm/llm.py,line=196,title=LLM.top_p::Attribute value was changed: `Field(default=1.0, ge=0, le=1)` -> `Field(default=None, ge=0, le=1, description='Nucleus sampling parameter. Defaults to None (uses provider default). Set to a value between 0 and 1 to control diversity of outputs.')`
::notice title=openhands-sdk API::Ignoring Field metadata-only change (non-breaking): prompt_cache_retention
Breaking changes detected (2) and version bump policy satisfied (1.11.5 -> 1.12.0)

Action log

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 3, 2026

Agent server REST API breakage checks (OpenAPI)

Result: Passed

Action log

Focus on actionable guidelines rather than implementation details
of how the skills were created.

Co-authored-by: openhands <openhands@all-hands.dev>
The AGENTS.md file itself is coding guidelines - no need for
a separate wrapper section.

Co-authored-by: openhands <openhands@all-hands.dev>
@neubig
Copy link
Copy Markdown
Contributor Author

neubig commented Mar 5, 2026

@OpenHands for eval-risk-assessment could we instead add a single line to AGENTS.md explaining the cases where integration tests should be run? So this whole PR can be one line.

@openhands-ai
Copy link
Copy Markdown

openhands-ai bot commented Mar 5, 2026

I'm on it! neubig can track my progress at all-hands.dev

Remove the separate eval-risk-assessment skill and consolidate guidance
into a single line in the TESTING section of AGENTS.md.

Co-authored-by: openhands <openhands@all-hands.dev>
@openhands-ai
Copy link
Copy Markdown

openhands-ai bot commented Mar 5, 2026

Summary

I addressed the reviewer's request to simplify the PR from a separate skill file to a single line in AGENTS.md.

Checklist

Request addressed: Removed the 50-line .openhands/skills/eval-risk-assessment/SKILL.md file and consolidated the guidance into a single line in AGENTS.md

Concise changes: The PR is now exactly one line added to the <TESTING> section:

"For changes to prompt templates, tool descriptions, or agent decision logic, flag the PR for human review and request lightweight integration tests to verify no unexpected impact on benchmark performance."

Already pushed: Changes were committed and pushed to openhands/learn-from-reviews-20260303 in commit dbabdd81

Final Diff

- .openhands/skills/eval-risk-assessment/SKILL.md (deleted, 50 lines)
+ AGENTS.md: 1 line added to TESTING section

The PR now matches the reviewer's request for "one line" capturing when integration tests should be run.

The integration tests can be triggered by adding the 'integration-test'
label to a PR, which the agent can do.

Co-authored-by: openhands <openhands@all-hands.dev>
@neubig neubig marked this pull request as ready for review March 7, 2026 01:58
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Good Taste - Pragmatic Documentation

The AGENTS.md guideline addition is clear and actionable. Flagging eval-risk changes for integration testing solves a real problem (preventing benchmark regressions) without over-engineering.

🟡 PR Description Mismatch

The description claims to add .openhands/skills/eval-risk-assessment/SKILL.md, but the commit history shows you simplified it to a single line in AGENTS.md instead (see "Simplify eval-risk to single AGENTS.md line"). The PR description should be updated to reflect this pragmatic decision.

Verdict: Content is solid. Just update the description to match what shipped, then this is ready to merge.

@enyst enyst changed the title DRAFT: Add learnings from code review analysis Add learnings from code review analysis Mar 7, 2026
@xingyaoww xingyaoww merged commit 0b3cb69 into main Mar 7, 2026
33 checks passed
@xingyaoww xingyaoww deleted the openhands/learn-from-reviews-20260303 branch March 7, 2026 06:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants